Triangulation of Reordering Tables: An Advancement Over Phrase Table Triangulation in Pivot-Based SMT

Authors

  • Deepak Patil
  • Harshad Chavan
  • Pushpak Bhattacharyya
Abstract

Triangulation in pivot-based Statistical Machine Translation (SMT) is an effective method for building Machine Translation (MT) systems when parallel corpora are scarce. Phrase table triangulation helps in such a resource-constrained setting by inducing new phrase pairs with the help of a pivot language. However, it does not explore the possibility of extracting reordering information through the pivot. This paper presents a novel method for triangulating reordering tables in pivot-based SMT. We show that a pivot can help extract better reordering information and improve translation quality. With a detailed example, we show that triangulation of reordering tables also improves the lexical choices a system makes during translation. We observe a BLEU score improvement of 1.06 for a Marathi-to-English MT system with Hindi as the pivot, and significant improvements in 8 other translation systems using this method.
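To make the baseline concrete, here is a minimal sketch of phrase table triangulation in its commonly described form, where the source-to-target probability is obtained by summing over shared pivot phrases: p(t | s) = Σ_p p(t | p) · p(p | s). This is an illustration only, not the reordering-table method of the paper, which the abstract does not specify. The function name `triangulate`, the nested-dictionary layout, and the toy romanized phrases and probabilities are all assumptions made for this example.

```python
from collections import defaultdict


def triangulate(src_pivot, pivot_tgt):
    """Induce p(tgt | src) by summing over pivot phrases shared by both tables.

    src_pivot maps src phrase -> {pivot phrase: p(pivot | src)}.
    pivot_tgt maps pivot phrase -> {tgt phrase: p(tgt | pivot)}.
    Both layouts are hypothetical, chosen only for this sketch.
    """
    src_tgt = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_pivot.items():
        for pivot, p_pivot_given_src in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt_given_pivot in pivot_tgt.get(pivot, {}).items():
                src_tgt[src][tgt] += p_pivot_given_src * p_tgt_given_pivot
    return {src: dict(tgts) for src, tgts in src_tgt.items()}


# Toy tables with invented entries (not taken from any real corpus).
marathi_hindi = {"ghar": {"ghar": 0.75, "makaan": 0.25}}
hindi_english = {
    "ghar": {"house": 0.5, "home": 0.5},
    "makaan": {"house": 1.0},
}

table = triangulate(marathi_hindi, hindi_english)
print(table)
```

In this toy case the pair ("ghar", "house") is reachable through two pivot phrases, so its induced probability accumulates mass from both paths. Real systems typically also prune low-probability pairs and handle further feature scores, which this sketch leaves out.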

Related Resources

Literature Survey: Study of Reordering in Pivot Based SMT

Pivot-based SMT addresses the scarcity of source-target parallel corpora by introducing a third, resource-rich ‘pivot’ language. The triangulation method in pivot-based SMT uses the pivot language to induce new phrase pairs into the phrase table; this process is known as ‘Phrase Table Triangulation’. Phrase Table Triangulation has been extensively studied by many researchers....

Evaluating Indirect Strategies for Chinese - Spanish Statistical Machine Translation: Extended Abstract

Although Chinese and Spanish are two of the most spoken languages in the world, not much research has been done in machine translation for this language pair. This paper focuses on investigating the state of the art of Chinese-to-Spanish statistical machine translation (SMT), which nowadays is one of the most popular approaches to machine translation. For this purpose, we report details of the...

Machine Translation by Triangulation: Making Effective Use of Multi-Parallel Corpora

Current phrase-based SMT systems perform poorly when using small training sets. This is a consequence of unreliable translation estimates and low coverage over source and target phrases. This paper presents a method which alleviates this problem by exploiting multiple translations of the same source phrase. Central to our approach is triangulation, the process of translating from a source to a ...

Tree as a Pivot: Syntactic Matching Methods in Pivot Translation

Pivot translation is a useful method for translating between languages with little or no parallel data by utilizing parallel data in an intermediate language such as English. A popular approach for pivot translation used in phrase-based or tree-based translation models combines source-pivot and pivot-target translation models into a source-target model, known as triangulation. However, this ...

Improving Machine Translation via Triangulation and Transliteration

In this paper, we improve Urdu→Hindi English machine translation through triangulation and transliteration. First, we build an Urdu→Hindi SMT system by inducing triangulated and transliterated phrase tables from Urdu–English and Hindi–English phrase translation models. We then use it to translate the Urdu part of the Urdu–English parallel data into Hindi, thus creating an artificial Hindi-English...

Publication date: 2015